RSS Feed Reader - Bulk RSS & Atom Feed Parser

Pricing

from $3.50 / 1,000 results

Read and parse RSS, Atom and RDF feeds in bulk, or auto-discover feeds from any website. Extract thousands of articles with full metadata for news monitoring, content aggregation and AI/RAG pipelines. No API key, export to CSV or JSON.

Developer

Logiover

Maintained by Community

RSS Feed Reader 📰 — Bulk RSS, Atom & RDF Feed Parser

Read hundreds of RSS, Atom and RDF feeds in a single run and get one clean row per item. The actor fetches every feed you give it (or auto-discovers feeds from plain website URLs), normalizes the different feed dialects, de-duplicates items and returns full metadata for each entry: title, link, GUID, publish date, author, categories, summary, full content and enclosure (podcast/media) URL.

Paste a list of feeds — or just a list of websites — and export the result as JSON, CSV, Excel or via API. No API key, no login, no headless browser. One row per item.

Looking for an RSS to JSON or RSS to CSV converter, a bulk RSS scraper, an Atom feed parser, a feed aggregator or RSS feed discovery for news monitoring? This actor does all of it at scale.


✨ Key features

  • 📡 Reads every common feed format — RSS 2.0, RSS 1.0 / RDF and Atom, all normalized into one consistent item shape.
  • 🔎 Feed auto-discovery — paste a normal website URL and the actor finds its feed via <link rel="alternate"> tags and common paths (/feed, /rss.xml, /atom.xml, and more).
  • 🧾 Full item metadata — title, link, GUID, publish date, author, categories, content snippet, full HTML content and enclosure URL.
  • 🧹 Automatic de-duplication — items repeated across feeds (by GUID/link) are collapsed so your dataset stays clean.
  • 🗓️ Normalized ISO dates — every pubDate is parsed and output as an ISO 8601 timestamp you can sort and filter on.
  • 🛡️ Per-feed error isolation — one broken or slow feed never kills the run; the rest keep going.
  • Fast & cheap — pure HTTP with configurable concurrency, no browser. Many feeds × many items = thousands of rows per run.

💡 Use cases

  • News & brand monitoring — track mentions, topics and outlets across hundreds of news feeds in near real time.
  • Content aggregation — power a reader, dashboard or homepage from dozens of sources at once.
  • AI / RAG / LLM data pipelines — turn fresh feed content into structured rows to feed retrieval, summarization or fine-tuning workflows.
  • Competitive intelligence — watch competitor blogs, changelogs and press feeds and get notified of every new post.
  • Newsletter & podcast tracking — follow Substack/blog feeds and podcast RSS, capturing enclosure (audio/media) URLs per episode.
  • Research & archiving — snapshot what a set of sources published, with timestamps, for analysis or compliance.

📦 What you get

One row per feed item, including:

| Field | Description |
| --- | --- |
| feedUrl | The feed the item came from |
| feedTitle | Title of the source feed |
| title | Item / article title |
| link | Canonical URL of the item |
| guid | Unique item identifier (used for de-duplication) |
| pubDate | Publish date, normalized to ISO 8601 |
| author | Item author / creator (when provided) |
| categories | Tags / categories on the item |
| contentSnippet | Plain-text summary / description |
| content | Full HTML content (when the feed provides it) |
| enclosureUrl | Attached media URL (podcast audio, image, etc.) |
| scrapedAt | When this item was read, ISO 8601 |

Example output

{
  "feedUrl": "https://techcrunch.com/feed/",
  "feedTitle": "TechCrunch",
  "title": "A new startup wants to reinvent the RSS reader",
  "link": "https://techcrunch.com/2026/06/14/rss-reader-startup/",
  "guid": "https://techcrunch.com/?p=2847193",
  "pubDate": "2026-06-14T16:32:00.000Z",
  "author": "Jane Doe",
  "categories": ["Startups", "Apps"],
  "contentSnippet": "The team behind the project says RSS is overdue for a comeback...",
  "content": "<p>The team behind the project says RSS is overdue for a comeback...</p>",
  "enclosureUrl": "",
  "scrapedAt": "2026-06-15T09:00:00.000Z"
}
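
The pubDate value above is an ISO 8601 timestamp, while raw feeds usually carry RFC 822 dates (RSS 2.0) or ISO-style dates (Atom). As an illustration only, a conversion of that kind could look like the sketch below. The function name to_iso8601 and the choice to treat zone-less dates as UTC are assumptions, not the actor's documented behavior.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def to_iso8601(raw: str) -> str:
    """Convert an RFC 822 or ISO 8601 date string to a UTC ISO 8601 timestamp."""
    raw = raw.strip()
    try:
        # RSS 2.0 style, e.g. "Sun, 14 Jun 2026 16:32:00 +0000"
        parsed = parsedate_to_datetime(raw)
    except (TypeError, ValueError):
        # Atom / Dublin Core style, e.g. "2026-06-14T16:32:00Z"
        parsed = datetime.fromisoformat(raw.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        # Assumption: dates without a zone are treated as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    utc = parsed.astimezone(timezone.utc)
    # Millisecond precision with a trailing "Z", matching the example output.
    return utc.strftime("%Y-%m-%dT%H:%M:%S.") + f"{utc.microsecond // 1000:03d}Z"
```

A string that matches neither format raises ValueError, so a caller would decide whether to drop the date or keep the raw text.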

🚀 How to use it

  1. Click Try for free / Start.
  2. Paste your Feed URLs — RSS, Atom or RDF feed links, one per line.
  3. (Optional) Turn on Discover feeds from websites and paste plain site URLs (e.g. https://example.com) — the actor will find their feeds for you.
  4. (Optional) Set Max items per feed and Max results to cap the output, or leave both at 0 to return everything.
  5. Click Save & Start.
  6. Export the dataset as JSON, CSV, Excel or pull it via the REST API.

⚙️ Input

| Option | Description | Default |
| --- | --- | --- |
| feedUrls | List of RSS / Atom / RDF feed URLs (or website URLs when discovery is on) | – (required) |
| discoverFromWebsites | Treat non-feed URLs as websites and auto-discover their feeds via <link> tags and common feed paths | false |
| maxItemsPerFeed | Max items to take from each feed (0 = all) | 0 |
| maxResults | Max total item rows across all feeds (0 = unlimited) | 0 |
| maxConcurrency | How many feeds to fetch in parallel | 10 |
| proxyConfiguration | Apify proxy settings (recommended to avoid rate limits) | Apify Proxy on |

Example input

{
  "feedUrls": [
    "https://hnrss.org/frontpage",
    "http://feeds.bbci.co.uk/news/rss.xml",
    "https://www.theverge.com/rss/index.xml",
    "https://techcrunch.com/feed/"
  ],
  "discoverFromWebsites": false,
  "maxItemsPerFeed": 0,
  "maxResults": 0
}

To let the actor find feeds for you, turn discovery on and pass plain sites:

{
  "feedUrls": ["https://www.theverge.com", "https://techcrunch.com"],
  "discoverFromWebsites": true
}
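
If you generate the input programmatically, a small helper can catch mistakes before a run starts. The sketch below builds the same JSON shape as the examples above, using the option names and defaults from the Input table. The helper build_run_input and its validation rules are hypothetical, and proxyConfiguration is left out because its exact shape is not described here.

```python
import json
from urllib.parse import urlparse


def build_run_input(feed_urls, discover_from_websites=False,
                    max_items_per_feed=0, max_results=0, max_concurrency=10):
    """Build an input dict for the actor (hypothetical helper, not part of the actor)."""
    urls = [u.strip() for u in feed_urls if u and u.strip()]
    if not urls:
        raise ValueError("feedUrls is required and must contain at least one URL")
    for url in urls:
        parts = urlparse(url)
        if parts.scheme not in ("http", "https") or not parts.netloc:
            raise ValueError(f"not an http(s) URL: {url!r}")
    # 0 means "no limit" for both caps, so only negative values are rejected.
    if max_items_per_feed < 0 or max_results < 0:
        raise ValueError("maxItemsPerFeed and maxResults must be 0 or greater")
    if max_concurrency < 1:
        raise ValueError("maxConcurrency must be at least 1")
    return {
        "feedUrls": urls,
        "discoverFromWebsites": bool(discover_from_websites),
        "maxItemsPerFeed": max_items_per_feed,
        "maxResults": max_results,
        "maxConcurrency": max_concurrency,
    }


if __name__ == "__main__":
    run_input = build_run_input(
        ["https://www.theverge.com", "https://techcrunch.com"],
        discover_from_websites=True,
    )
    print(json.dumps(run_input, indent=2))
```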

🔍 How it works

For each input URL the actor fetches the resource over HTTP (through the Apify proxy you configure, to avoid rate limiting and IP blocks).

  • Feeds are parsed regardless of dialect — it handles RSS 2.0, RSS 1.0 / RDF and Atom, mapping each into one consistent item shape with normalized ISO dates.
  • Websites (when Discover feeds from websites is on) are scanned for feeds: the HTML is read for <link rel="alternate" type="application/rss+xml|atom+xml"> hints, and common paths like /feed, /rss, /rss.xml, /atom.xml and /index.xml are probed. Discovered feeds are then parsed like any other.
  • De-duplication collapses items that repeat across feeds using their GUID/link, so the dataset stays clean.
  • Per-feed error isolation means a single broken, empty or unreachable feed is skipped without stopping the run — every other feed still produces rows.
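
The discovery step described above can be sketched with the standard library alone. This is a minimal illustration, not the actor's code: it collects <link rel="alternate"> hints of the two feed MIME types and then appends the common paths named above. The names candidate_feed_urls and _FeedLinkParser are hypothetical, and a real run would still need to fetch each candidate to confirm it is a feed.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

FEED_TYPES = {"application/rss+xml", "application/atom+xml"}
COMMON_PATHS = ["/feed", "/rss", "/rss.xml", "/atom.xml", "/index.xml"]


class _FeedLinkParser(HTMLParser):
    """Collect href values of <link rel="alternate"> tags that point at feeds."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = {name: (value or "") for name, value in attrs}
        rels = attr.get("rel", "").lower().split()
        mime = attr.get("type", "").lower().strip()
        if "alternate" in rels and mime in FEED_TYPES and attr.get("href"):
            self.hrefs.append(attr["href"])


def candidate_feed_urls(page_url: str, html: str) -> list:
    """Return feed URL candidates: declared <link> hints first, then common paths."""
    parser = _FeedLinkParser()
    parser.feed(html)
    candidates = [urljoin(page_url, href) for href in parser.hrefs]
    candidates += [urljoin(page_url, path) for path in COMMON_PATHS]
    # dict.fromkeys removes repeats while keeping the original order.
    return list(dict.fromkeys(candidates))
```

Putting declared hints ahead of guessed paths means the site's own answer is tried first, which tends to save requests.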

Pure HTTP, high concurrency, no headless browser — so many feeds × many items turn into thousands of rows fast and cheaply.
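
One way to picture concurrent fetching with per-feed error isolation is a thread pool in which each feed's failure is recorded instead of raised. The sketch below is an assumption about the general pattern, not the actor's implementation. The fetch_one callable is supplied by the caller and is hypothetical, which also keeps the example free of network access.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def fetch_all(urls, fetch_one, max_concurrency=10):
    """Fetch every URL in parallel; a failing feed is recorded, never re-raised."""
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        futures = {pool.submit(fetch_one, url): url for url in urls}
        for future in as_completed(futures):
            url = futures[future]
            try:
                results[url] = future.result()
            except Exception as exc:
                # Isolate the failure: note it and keep processing other feeds.
                errors[url] = repr(exc)
    return results, errors
```

Returning the errors alongside the results lets a caller log which feeds were skipped without losing the rows from the others.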

🧰 Tips & best practices

  • Keep Apify Proxy enabled — many news sites rate-limit by IP, and rotating proxies keep large runs reliable.
  • Use Max items per feed when you only care about the latest posts; leave it at 0 to backfill everything a feed exposes.
  • Set Max results to put a hard ceiling on a run's size (and cost) when you're aggregating hundreds of feeds.
  • Sort or filter the dataset by pubDate (already ISO 8601) to get a clean reverse-chronological timeline across all sources.
  • Schedule the actor to poll your feeds on a cadence — combined with de-duplication it becomes a steady stream of new items only.
  • Not sure of a site's feed URL? Just paste the homepage and turn on Discover feeds from websites.
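
Because pubDate is already ISO 8601, building a reverse-chronological timeline from an exported JSON dataset takes only a few lines. The sketch below assumes rows shaped like the example output. The helper names parse_iso and latest_first are hypothetical, and rows without a pubDate are simply left out.

```python
from datetime import datetime, timezone


def parse_iso(timestamp: str) -> datetime:
    """Parse timestamps such as '2026-06-14T16:32:00.000Z'."""
    return datetime.fromisoformat(timestamp.replace("Z", "+00:00"))


def latest_first(rows, since=None):
    """Return dated rows newest first, optionally keeping only rows at or after `since`."""
    dated = [row for row in rows if row.get("pubDate")]
    if since is not None:
        dated = [row for row in dated if parse_iso(row["pubDate"]) >= since]
    return sorted(dated, key=lambda row: parse_iso(row["pubDate"]), reverse=True)


if __name__ == "__main__":
    rows = [
        {"title": "older post", "pubDate": "2026-06-13T08:00:00.000Z"},
        {"title": "newer post", "pubDate": "2026-06-14T16:32:00.000Z"},
    ]
    cutoff = datetime(2026, 6, 14, tzinfo=timezone.utc)
    for row in latest_first(rows, since=cutoff):
        print(row["pubDate"], row["title"])
```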

❓ FAQ

How do I convert RSS feeds to JSON or CSV?

Paste your feed URLs, run the actor, then download the dataset as JSON, CSV or Excel (or pull it via the REST API). Every feed item is one row, so it drops straight into a spreadsheet or a data pipeline — an instant RSS to JSON and RSS to CSV converter.
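
The built-in export is usually enough. If you pull JSON and want to control the flattening yourself, for example how the categories list is joined, a sketch like the following works on rows shaped like the example output. The function rows_to_csv and the "; " separator are illustrative choices, not something the actor prescribes.

```python
import csv
import io

FIELDS = [
    "feedUrl", "feedTitle", "title", "link", "guid", "pubDate", "author",
    "categories", "contentSnippet", "content", "enclosureUrl", "scrapedAt",
]


def rows_to_csv(rows) -> str:
    """Serialize item rows to CSV text, one line per feed item."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    for row in rows:
        flat = dict(row)
        # Lists do not fit in a single cell, so join categories into one string.
        flat["categories"] = "; ".join(row.get("categories") or [])
        writer.writerow(flat)  # missing fields are written as empty cells
    return buffer.getvalue()
```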

Can I read many RSS feeds at once?

Yes — this is a bulk RSS reader. Paste hundreds of feed URLs and the actor fetches them in parallel (configurable concurrency) and merges every item into a single dataset, with de-duplication across feeds.

Does it support Atom and RDF feeds, not just RSS?

Yes. It parses RSS 2.0, RSS 1.0 / RDF and Atom feeds and normalizes them all into the same item shape, so you don't have to care which dialect a source uses.
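
To make "the same item shape" concrete, here is a minimal sketch of dialect normalization using Python's xml.etree.ElementTree. It ignores XML namespaces and maps RSS 2.0 item, RSS 1.0 / RDF item and Atom entry elements onto four of the output fields. The helper names are hypothetical and this is not the actor's parser. For untrusted feeds, a hardened XML parser would be a safer choice than the standard library one.

```python
import xml.etree.ElementTree as ET


def _local(tag: str) -> str:
    """Drop the '{namespace}' prefix ElementTree puts on qualified tags."""
    return tag.rsplit("}", 1)[-1]


def _text(elem, *names: str) -> str:
    """Return the text of the first child matching a name, in priority order."""
    for name in names:
        for child in elem:
            if _local(child.tag) == name and child.text and child.text.strip():
                return child.text.strip()
    return ""


def _link(elem) -> str:
    for child in elem:
        if _local(child.tag) != "link":
            continue
        href = child.get("href")
        if href and child.get("rel", "alternate") == "alternate":
            return href  # Atom: <link href="..."/>
        if not href and child.text and child.text.strip():
            return child.text.strip()  # RSS: <link>...</link>
    return ""


def parse_items(xml_text: str) -> list:
    """Parse RSS 2.0, RSS 1.0 / RDF or Atom text into a list of uniform dicts."""
    root = ET.fromstring(xml_text)
    items = []
    for elem in root.iter():
        if _local(elem.tag) not in ("item", "entry"):
            continue
        items.append({
            "title": _text(elem, "title"),
            "link": _link(elem),
            "guid": _text(elem, "guid", "id"),
            "pubDate": _text(elem, "pubDate", "published", "updated", "date"),
        })
    return items
```

The pubDate here is still the raw string from the feed; date normalization would be a separate step.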

Can it find the RSS feed of a website automatically?

Yes — turn on Discover feeds from websites and paste plain site URLs. The actor reads the page's <link rel="alternate"> tags and probes common feed paths (/feed, /rss.xml, /atom.xml, etc.) to locate the feed, then parses it. It works as a built-in RSS feed discovery tool.

Is this a free RSS reader without an API key?

There's no API key and no login to set up; you just provide feed URLs. It runs on Apify like any other actor, so usage is billed through your Apify account. The Store listing shows pricing from $3.50 per 1,000 results.

Can I use this for an AI / RAG / LLM news pipeline?

Absolutely. It outputs clean, structured rows (title, link, date, author, full content) that feed directly into RAG / LLM ingestion, summarization or classification pipelines. Schedule it to keep your knowledge base fresh with the latest items.
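
As a rough sketch of the ingestion step, the code below turns one output row into a plain-text document with metadata, which is the shape many retrieval pipelines expect. The document layout and the names to_document and _TextExtractor are assumptions for illustration. Chunking and embedding are left to whatever pipeline you use.

```python
from html.parser import HTMLParser

BLOCK_TAGS = {"p", "div", "br", "li", "h1", "h2", "h3", "h4", "blockquote", "tr"}


class _TextExtractor(HTMLParser):
    """Collect text nodes, inserting a space at block-level tag boundaries."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self.parts.append(" ")

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS:
            self.parts.append(" ")

    def handle_data(self, data):
        self.parts.append(data)


def to_document(row: dict) -> dict:
    """Build a plain-text document from a row; falls back to contentSnippet."""
    extractor = _TextExtractor()
    extractor.feed(row.get("content") or row.get("contentSnippet") or "")
    extractor.close()
    body = " ".join("".join(extractor.parts).split())  # collapse whitespace
    title = row.get("title", "")
    return {
        "id": row.get("guid") or row.get("link", ""),
        "text": f"{title}\n\n{body}".strip(),
        "metadata": {
            "source": row.get("link", ""),
            "published": row.get("pubDate", ""),
            "author": row.get("author", ""),
        },
    }
```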

How does de-duplication work?

Items that appear in more than one feed are collapsed by their GUID/link, so the same article won't show up twice in your dataset even when you aggregate overlapping sources.
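
The idea can be sketched in a few lines. The version below keys each item on its GUID and falls back to the link when the GUID is empty. That precedence, and keeping the first occurrence, are assumptions made for the example, since the exact rule is not spelled out here.

```python
def dedupe(items):
    """Keep the first occurrence of each item, keyed by GUID or, failing that, link."""
    seen = set()
    unique = []
    for item in items:
        key = item.get("guid") or item.get("link")
        if not key:
            # Nothing to compare on, so the item is kept as-is.
            unique.append(item)
            continue
        if key in seen:
            continue
        seen.add(key)
        unique.append(item)
    return unique
```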

📝 Changelog

2026-06-15

  • Initial release — bulk RSS/Atom/RDF feed reader with feed auto-discovery and full item metadata, CSV/JSON export, no API key.